13. Conclusions Using Groupby

Drawing Conclusions Using Groupby

In the notebook below, you're going to investigate two questions about this data using pandas' groupby function. Here are tips for answering each question:

Q1: Is a certain type of wine (red or white) associated with higher quality?

For this question, use groupby to compare the average quality of red wine with the average quality of white wine. To do this, group by color and then find the mean quality of each group.
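The grouping step described above can be sketched as follows. This is only a minimal illustration: the small DataFrame is made-up stand-in data, and the column names `color` and `quality` are assumptions about how the notebook's wine data is laid out.

```python
import pandas as pd

# Toy stand-in for the wine data; these values are made up for illustration
# and the column names are assumed, so adjust them to match the notebook.
df = pd.DataFrame({
    "color": ["red", "red", "red", "white", "white", "white"],
    "quality": [5, 6, 5, 6, 7, 6],
})

# Group the rows by color, then average the quality column within each group.
mean_quality_by_color = df.groupby("color")["quality"].mean()
print(mean_quality_by_color)
```

The result is a Series indexed by color, so the two means can be compared directly.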

Q2: What level of acidity (pH value) receives the highest average rating?

This question is trickier because, unlike color, which has clear categories you can group by (red and white), pH is a quantitative variable without clear categories. However, there is a simple fix: you can create a categorical variable from a quantitative variable by defining your own categories. pandas' cut function lets you "cut" data into groups. Using this, create a new column called acidity_levels with these categories:

Acidity Levels:

  1. High: Lowest 25% of pH values
  2. Moderately High: 25% - 50% of pH values
  3. Medium: 50% - 75% of pH values
  4. Low: 75% - max pH value

Here, the data is being split at the 25th, 50th, and 75th percentiles. Remember, you can get these numbers with pandas' describe()! After you create these four categories, you'll be able to use groupby to get the mean quality rating for each acidity level.
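One possible way to put these steps together is sketched below. The data is again made up for illustration, the column names `pH` and `quality` are assumptions, and `include_lowest=True` is an added choice (not something the lesson specifies) so that the minimum pH value falls into the first bin instead of being left uncategorized.

```python
import pandas as pd

# Toy stand-in for the wine data; values and column names are assumptions.
df = pd.DataFrame({
    "pH": [2.9, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6],
    "quality": [5, 6, 5, 6, 6, 7, 6, 7],
})

# describe() reports the min, 25%, 50%, 75%, and max of the column.
stats = df["pH"].describe()
bin_edges = [stats["min"], stats["25%"], stats["50%"], stats["75%"], stats["max"]]

# Lower pH means higher acidity, so the lowest pH bin is labeled "High".
bin_names = ["High", "Moderately High", "Medium", "Low"]

# Cut pH into the four bins; include_lowest=True keeps the minimum value
# inside the first bin (an assumption added here, not part of the lesson).
df["acidity_levels"] = pd.cut(
    df["pH"], bins=bin_edges, labels=bin_names, include_lowest=True
)

# Mean quality for each acidity level.
mean_quality_by_acidity = df.groupby("acidity_levels", observed=True)["quality"].mean()
print(mean_quality_by_acidity)
```

Note that cut expects strictly increasing bin edges, so this approach assumes the min, quartiles, and max of the real data are all distinct.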

Is the mean quality of red wine greater than, less than, or equal to that of white wine?

SOLUTION: Less

What level of acidity receives the highest mean quality rating?

SOLUTION: Low